Constraining the response of continental-scale groundwater flow to climate change

Numerical models of groundwater flow play a critical role for water management scenarios under climate extremes. Large-scale models play a key role in determining long range flow pathways from continental interiors to the oceans, yet struggle to simulate the local flow patterns offered by small-scale models. We have developed a highly scalable numerical framework to model continental groundwater flow which capture the intricate flow pathways between deep aquifers and the near-surface. The coupled thermal-hydraulic basin structure is inferred from hydraulic head measurements, recharge estimates from geochemical proxies, and borehole temperature data using a Bayesian framework. We use it to model the deep groundwater flow beneath the Sydney–Gunnedah–Bowen Basin, part of Australia’s largest aquifer system. Coastal aquifers have flow rates of up to 0.3 m/day, and a corresponding groundwater residence time of just 2,000 years. In contrast, our model predicts slow flow rates of 0.005 m/day for inland aquifers, resulting in a groundwater residence time of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sim $$\end{document}∼ 400,000 years. Perturbing the model to account for a drop in borehole water levels since 2000, we find that lengthened inland flow pathways depart significantly from pre-2000 streamlines as groundwater is drawn further from recharge zones in a drying climate. Our results illustrate that progressively increasing water extraction from inland aquifers may permanently alter long-range flow pathways. Our open-source modelling approach can be extended to any basin and may help inform policies on the sustainable management of groundwater.

The velocity field of continent-scale groundwater flow is broadly characterised by recharge in regions of high topography, where the water table is elevated, and discharge from continental slopes into the ocean 1 . While this approximation does not hold at smaller scales where spatial variations in the permeability due to structural and lithological inhomogeneities control the flow pathways 2 , models which encompass multi-scale groundwater flow patterns offer an opportunity to understand the hydrologic cycle under a changing climate 3 . Most continent to global-scale groundwater studies have focused on the variations of water table depth 4 , shallow groundwater modelling in unconfined aquifers 5,6 , or regional flow patterns subject to climate forcing 7 . Combining observations with models to predict groundwater sustainability in response to climate change is a global challenge 8 , and datadriven continental-scale models which encompass deep aquifers controlled by geology represents a particular gap in knowledge. Such models present a significant computational challenge to operate at sufficient resolution honouring deep geometrical and hydraulic structures, particularly where it is necessary to run ensembles to explore uncertainty of the hydraulic regime. We present a numerical framework for coupled hydro-thermal modelling to resolve the flow dynamics within deep aquifers of continental-scale basins. The framework is designed for efficient parallel scaling on high performance computing (HPC) infrastructure and has been built to integrate common hydraulic observations, including hydraulic head measurements in boreholes and recharge estimates from the chloride mass balance method 9 . Data assimilation uses a Bayesian framework, where the hydraulic conductivity of each geological layer in the basin is inverted to an optimal value consistent with the constraining data.
Motivation for groundwater modelling in eastern Australia. Groundwater is a crucial resource in Australia, the world's driest inhabited continent. Increasing aridity and agricultural water-utilisation have placed progressive strain on Australia's inland water resources, and Australia's reliance on groundwater is expected to increase due to population growth and a higher propensity for drought 10 . Changes in groundwater depth are known to affect the susceptibility of regions to changes in temperature and precipitation 11 , which is compounded by increased evapotranspiration due to a warmer climate and, in-turn, a diminished capacity for groundwater storage 12 . While an administratively strict licensing system exists for groundwater use in Australia, this relies on the limited knowledge about the capacities and flow rates in aquifers interpreted from spatially sparse borehole data across a large region 10,13 . The Sydney-Gunnedah-Bowen (SGB) Basin comprises part of the Great Artesian Basin, the largest and deepest artesian basin in the world, and supports agricultural production, mining, industry and urban activities across much of the eastern third of the continent (Fig. 1). Regional groundwater levels have declined since 2000 due to pumping of shallow alluvial sediments 10 , yet the rate of extraction to maintain environmental flows is unknown despite the risk of permanently reducing groundwater storage 14 , and increasing groundwater and river salinity from redirected flow pathways 15 . Rates of groundwater flow have previously been measured using isotopic tracers obtained from shallow boreholes 16 , and recharge rates have been inferred from the concentration of chloride in rainfall and groundwater 9 to estimate local flow rates in specific areas of the SGB Basin. However, the flow dynamics within deep aquifers is largely unconstrained and the degree to which deep aquifers recharge the more productive and higher quality shallow alluvial sediments at the basin scale is relatively unexplored. Existing workflows used to model groundwater assume topography-driven groundwater flow from regions of high to low elevation 17 , but they struggle to capture the intricate flow pathways between deeper aquifers and the surface within continental-scale basins such as the SGB Basin. Our inverse modelling framework accomplishes this by making the following advances: (i) solving the equations that govern groundwater flow using a finite-element particle-in-cell method on deformed meshes 18 , (ii) resolving multi-scale flow patterns and communication between deep and near-surface aquifers over continent-scale domains, (iii) casting the inversion of hydraulic properties of aquifers using a Bayesian approach.
Regional groundwater dynamics in eastern Australia. The bulk flow field of the SGB system is largely fed by recharge from the Great Dividing Range and discharged out to the Tasman Sea with previously estimated residence times ranging from < 70 years to > 100 000 years 16 . Smaller topographic variations overprint the regional flow pattern. Flow rates are generally determined by the hydraulic conductivity of sedimentary layers within the basin. Permeable rocks, such as Jurassic sediments, are efficient aquifers, whereas rift volcanics deposited during the Early Permian generally behave as aquitards that limit the flow of water. It has long been recognised that shallow alluvial sediments are partly recharged by leakage from deeper aquifers of the SGB Basin, which comprise the eastern portion of the Great Artesian Basin 16,21 , although the scale of this feedback is poorly understood. Through integrating data with a geological model of basin architecture, we are able to probe the complex fluid pathways from recharge to discharge and the degree of mixing between each layer.
The SGB Basin architecture is described here by a deep 3D model constrained by gravity modelling and lithology data from core samples 22,23 . The region contains abundant Permian-Jurassic black coal measures and mafic volcanic material interspersed with sandstone and mudstone layers. We populate a 3D model of basin architecture ( Fig. 2) with initial thermal and hydraulic material properties described in Table 1. These starting values of lithology-dependent thermal and hydraulic parameters are reference values obtained from material averages in the literature [24][25][26] and form the a priori information used within the inversion (refer to "Methods" section). The 3D stratigraphic model does not resolve the individual formation-scale rock units of the Sydney Basin, but condenses these formations into the dominant lithologies that adequately describe the geological heterogeneity across large regions. The thermal and hydrological properties that are mapped to layers in the 3D model are adjusted by temperature-dependent and depth-dependent relationships within our numerical modelling workflow (refer to "Methods" for a detailed description of the governing equations).

Data assimilation
Each borehole facilitates an independent observation of the hydro-thermal field which jointly provides valuable constraints on subsurface temperature and the groundwater flow field in our coupled system. Our approach here is to cast the inverse problem of coupled heat conduction and groundwater flow within a Bayesian framework, which inverts lithology-dependent hydraulic conductivity ( k h ), thermal conductivity ( k T ), rate of heat production (H), and the lower temperature boundary condition ( T 1 ) from borehole data (refer to the "Methods" section Multiple data types are used to constrain our coupled groundwater-heatflow models, including maps of (A) elevation of the study area overlain with borehole measurements of temperature gradients 19 and groundwater recharge rates 9 . Blue lines indicate major perennial streams. (B) The depth to the water overlain with the hydraulic pressure head ( h P = P/ρ w g in metres) measured in groundwater boreholes (Table S1). Satellite imagery was obtained from the ETOPO1 global relief model 20 .  Hydraulic head measurements. We use the database of groundwater level monitoring sites from the National Groundwater Information System (NGIS, http:// www. bom. gov. au/ water/ groun dwater/ ngis/) to constrain hydraulic head within major aquifers of the SGB Basin. 15,018 monitoring boreholes have been extracted from the NGIS and filtered to include groundwater level observations prior to the year 2000 (Table S1). This study is primarily focused on natural groundwater flow rates, thus we exclude water levels measurements that were made after 2000 where groundwater extraction may have increasingly affected the water pressures in aquifers 13 . The spatial distribution of observation boreholes is spread relatively uniformly across the study area ( Fig. 1). 90% of inland boreholes penetrate less than 150 m deep; the remainder extend down to 2500 m predominantly in inland regions and intersect the confined aquifers of the Great Artesian Basin (Fig. 3). As such, these bores have very high pressure heads 13,27 . The integration of hydraulic head measurements within our inverse modelling framework constrains the hydraulic conductivity of major lithologies. In particular, high pressures observed within inland aquifers must be maintained by the low-permeability of overlying lithologies.
Groundwater recharge data. Groundwater recharge is difficult to quantify as it cannot be directly measured. Instead, geochemical proxies can be used to constrain recharge rates such as the chloride mass balance method 28,29 . Chloride is excluded from evapo-transpiration; thus, assuming no change in salt storage over very Table 1. Material properties assigned to major lithologies in the Sydney-Gunnedah-Bowen Basin. Bracketed numbers indicate one standard deviation from the mean-i.e. mean (std). www.nature.com/scientificreports/ long time periods, recharge, q r , can be estimated from the rate of precipitation and the ratio of chloride concentration in rainfall, Cl p , and groundwater, Cl gw , The mass of Cl into the groundwater system from precipitation, q p , is balanced by the drainage out of the system after budgeting for Cl lost to surface runoff. Equation (1) can be expanded to consider the lateral flow of chloride with surface water runoff, where Cl on and Cl off is the concentration of chloride in runon and runoff, respectively. The chloride mass balance method assumes that Cl is conservative within the system and there is no variation in chloride fluxes through time 9,30 . We incorporate 1279 recharge estimates from Cl measurements in boreholes to constrain the recharge rates of shallow rock units within SGB Basin (Table S2) 9 . These measurements are mostly clustered within the Sydney and Gunnedah portions of the SGB Basin system. The high variability of recharge rates (0.001-2 mm/day) can be attributed to heterogeneous surface geology, which have hydraulic conductivities that differ by orders of magnitude (Table 1). Cl fluxes provide valuable constraints for recharge rates within the basin. By assimilating these recharge data across the entire basin, with their respective uncertainties, we can constrain the hydraulic conductivity of major aquifers and estimate groundwater flow rates.
Borehole temperature data. We use borehole temperature data to indirectly estimate groundwater flow velocities from their perturbation of temperature gradients. These extend up to 3 km deep, which resolves deeper long-range flow velocities than can be observed from groundwater monitoring boreholes that rarely extend below 300 m. This is possible because upward and downward groundwater flow results in lower and higher geothermal gradients, respectively, which are observed where aquifers and boreholes intersect 31,32 . Depending on the temperature differential and the hydraulic conductivity of an aquifer, thermal perturbations measured within boreholes may pervade a large region 33 . Temperature gradients vary considerably within the SGB Basin (Fig. 4). The average temperature gradient within 268 boreholes is approximately 26 • C/km and most of the data are clustered within the Sydney and Gunnedah Basins 19 (Table S3). Some are derived from industry data that have been filtered for unequilibrated bottom-hole temperature measurements. Palaeoclimate corrections have been applied to each borehole consistent with the surface temperature history of the region 19 .

Results
We identified the optimal model of groundwater flow and thermal structure from our ensemble as the model that obtains the best fit between available hydrologic and thermal data with prior information (Table 1). This model, the maximum a posteriori (MAP) model, of lithology-dependent hydraulic conductivity, thermal conductivity, rate of heat production, and lower temperature boundary condition was found after 6600 evaluations of the forward model using the inversion approach outlined in the "Methods" section. We obtain identical MAP estimates   34 , which suggests we have identified the global minimum of the objective function ("Methods" section). For the most part, we are able to reproduce the high degree of variability between our model and the data (Fig. 5). The model misfits (measured as ℓ 2 -norm) follow a logarithmic decay for each dataset with the exception of the temperature gradients. These are skewed towards zero, but do not decay logarithmically with increasing misfit because (i) there are relatively few measurements ( n = 165 ), (ii) much of the data is very shallow and are taken within interbedded coal-sand layers, and (iii) temperature gradients can vary considerably depending on circulation time after drilling 19 . Groundwater recharge velocities are heavily skewed towards zero misfit within the Sydney Basin, and larger misfits proximal to the coast. Smaller misfits observed within the highlands indicate that the recharge of groundwater into deeper aquifers of the SGB Basin are concordant with observations, whereas coastal outflow points are of lesser concern towards the very end of their flow pathways out to the ocean. Conversely, low misfits between our simulation and the hydraulic head observed in groundwater monitoring boreholes is more important for inland regions Figure 5. Map of misfit between simulated groundwater pressure, velocity and temperature gradients (from the MAP estimate) with borehole observations. The ℓ 2 -norm misfit histograms indicate the optimal fit between model and data. These maps were generated from our models using Cartopy 0.19 (https:// scito ols. org. uk/ carto py). where much of the agricultural production is located, rather than highland regions. Seasonal rainfall exerts a much stronger influence on variation in water levels observed at recharge points, which helps to explain the larger misfits of hydraulic head along the eastern highlands. The posterior probability distribution of each parameter is given in Fig. 6. The lower temperature boundary condition, T 1 , is most sensitive to the temperature data: from a uniform prior distribution between 100-900 • C, we obtain T 1 = 521 ± 98 • C a posteriori. The values of lithology-dependent hydraulic conductivity are mostly constrained from groundwater recharge rates we assimilated from Cl concentrations 9 , and hydraulic head measurements in NGIS groundwater monitoring bores. Lithologies with the largest volume in our model are the most constrained since they have a proportionally larger impact on groundwater flow velocities and water levels observed at monitoring sites. Furthermore, the proportion of hydraulic head measurements within lithologies of higher permeability is considerably greater than other lithologies. The Jurassic sediments, basin metasediments, and Late Carboniferous volcanic layers are best constrained by the data.
From the MAP estimate, we obtained the simulated temperature, fluid pressure head, and velocity fields for the SGB Basin (Fig. 7). Flow rates reach a peak of 0.3 m/day at the discharge zone of the Sydney Basin along the continental shelf (Fig. 7A). Depth slices through the basin highlight the carrying ability of the aquifer network to transport groundwater from recharge zones in nearby mountain ranges and also long-range flow from inland regions (Fig. 8). The mountain ranges broadly delineate coastal aquifers from inland aquifers that possess much slower flow rates around 0.005 m/day. These inland aquifers are 1-2 km deep which underlie alluvial sediments which incur the most groundwater pumping. The average distance between recharge and discharge zones in inland aquifers is 700 km, which corresponds to a groundwater residence time of approximately 400,000 years. Contrast this with coastal aquifers, where the average transport distance of an aquifer is 200 km and seepage velocity of 0.3 m/day equates to a groundwater residence time of just 2,000 years. The disconnect between recharge inputs and potential extraction is a significant feature that can only be captured using large-scale basin modelling and is important for groundwater management.
Numerical simulations of flow and heat transport in confined and unconfined carbonates in Hungary reveals the dominance of topography-driven flow in redistributing heat 35 . Here, the coal measures within the Sydney-Gunnedah-Bowen Basin are very effective thermal insulators (0.3 ± 0.1 W m −1 K −1 ), which refract heat toward more thermally conductive sedimentary layers ( Figure 7B). Exceptionally high heat flow values are modelled around the rim of coal seams (100 mW m −2 ), which is comparable to high heat flow measured in Proterozoic regions enriched in high heat-producing elements 36 or Phanerozoic regions associated with recent volcanism ( ≤ 5 Ma) 37 . These results significantly improve the modelling of 3D heat-flow paths that is consistent across the entire basin. Earlier studies of the regional thermal regime, which primarily relied on forward modelling 2D www.nature.com/scientificreports/ cross sections to borehole temperature data 19,33 or downward extrapolation of bottom-hole temperatures to 5 km depth used in geothermal prospectivity maps 38 . The groundwater flow field augments the conductive regime by transporting heat from coal seams toward discharge zones on either side of the mountain range. By performing coupled thermal-hydraulic inversions, our model encapsulates more data that are sensitive to temperature variation than what can be gleaned from heat flow data alone.

Discussion
The modelling approach we have taken applies large scale numerical modelling of coupled groundwater-heat flow to the SGB Basin using a Bayesian inverse framework. Our models jointly invert the optimal configuration of hydrologic and thermal parameters that best fit the data. The model we have developed can be used to probe the steady-state dynamics of groundwater flow within the SGB Basin and can be a reference point for future time-dependent models that vary groundwater extraction and recharge rates. The framework we have developed may be extended to other regions with little difficulty.  www.nature.com/scientificreports/ with abundant groundwater due to high recharge from the eastern highlands of Australia. Discharge from the continental shelf into the Pacific Ocean occur at a maximum rate of 0.3 m/day from the Sydney Basin aquifer system. Due to this high rate of discharge and close proximity to recharge zones, it is less prevalent for saline marine water to contaminate coastal aquifers observed elsewhere in the world 39,40 , with only a small number of coastal extraction sites in Newcastle and Queensland reporting saline intrusion 41 . In comparison, flow velocities within inland deep aquifers are between 10 −5 -10 −3 m /day and have a residence times of approximately 400,000 years. Near-surface alluvial aquifers not modelled here may have higher flow rates and be a more viable groundwater resource, but are also susceptible to seasonal variability in recharge rates and potentially complex interaction with deeper aquifers. The flow within the deep aquifers in our model provides an important constraint for understanding leakage and upward pressure.
Anthropogenic groundwater extraction. These groundwater flow models infer steady-state fluid velocities from hydraulic data collected before groundwater extraction began to increase significantly post-2000 13,27 . By assimilating data averaged over a long timescale up to 2000, the model we have presented is as close to the natural groundwater dynamics as is possible from the data we have available. In more recent years the rate of extraction has increased throughout the basin 42 , particularly due to the millenium drought, and the water depth measured in groundwater bores have departed from pre-2000 levels (Fig. 9). On average, the entire basin has experienced a 7 m decline in water levels and up to 17 m in agricultural regions west of the highlands (Fig. 10B). Borehole water levels have plateaued in the years since 2010 and if this reaches a new steady state, the storage capacity of near-surface aquifers will have decreased by 4,000,000 GL over the entire model domain. A La Niña event between 2010-2011 brought considerably more precipitation to Eastern Australia which may have increased the decadal average between 2010 and 2020, so the reduction in storage capacity of these aquifers may be even further reduced from our estimates. Re-computing the simulation using a reduction in water table consistent with the change in groundwater levels over 2010-2020 (Table S4) significantly alters the flow patterns from our previous model inferred from pre-2000 water level data (Table S1). The local draw-down of the water table increases the hydraulic head gradient with underlying aquifers, and enhances recharge to the alluvial sediments. Dark red streamlines in Fig. 10A, which represent steady-state flow patterns between 2010 and 2020, reveal a significant departure from pre-2000 streamlines (in dark grey). The fluid pathways are longer as groundwater is drawn further from recharge zones to replenish alluvial sediments that have been pumped since 2000. This is particularly prevalent in inland regions where permeable interbedded Permian sediment and coal layers recharge Jurassic sediments, which in turn recharge alluvial sediments. Isotopic tracers indicate a mixing of groundwaters with residence times of < 70 years and hundreds of thousands of years old, which is consistent with that of the Great Artesian Basin 16 . In keeping with the trace isotope inter-connectivity of alluvial sediments in inland regions, such as the Namoi Valley 21 , with deeper aquifers that comprise the SGB Basin and Great Artesian Basin, our modelling is consistent with the hypothesis that shallow aquifers are recharged by leakage from deeper aquifers.
Changing groundwater flow pathways. Comparing the variation in velocity along inland flow pathways reveals that aquifers between 1-2 km depth experience 20% higher velocities as a result of groundwater extraction since 2000 (Fig. 10C). These higher velocities are a consequence of deeper aquifers of the SGB and Great Artesian Basin recharging alluvial sediments in certain parts of the SGB where significant pumping has occurred since 2000. The velocity variation at the surface is not uniform: highland regions generally experience a 0-30% increase in flow rates as deeper aquifers are recharged; in contrast some regions that were discharge zones now divert groundwater further west of the highlands by up to 60% of pre-2000 flow rates. The implications of this are two-fold: first, the reduced water table encourages recharge from rivers and lakes into surrounding aquifers, thereby reducing surface water supplies 43 . We find that the water table beneath major perennial streams deepened by an average of 7 m since 2000 (Fig. 10B), which can increase recharge from increasingly drier surface water resources on the western side of the Australian highlands 44 . Second, the diversion of flow pathways from perennial streams since 2000 has important consequences for worsening salinity conditions in upstream catchments and abandoned discharge zones during drought conditions 15 . Increasing groundwater and river salinity directly can directly affect agricultural production in inland regions, particularly where established groundwater flow pathways have been disconnected as a result of pumping. The consequences of groundwater pumping within coastal aquifers is relatively diminished due to higher annual rainfall and lower groundwater residence times. Our simulation assumes that the water table is in equilibrium with recharge from rainfall and extraction from groundwater bores; however, groundwater flow through the alluvial sediments occurs over much shorter timescales than deeper aquifers and will require time-dependent modelling for future extraction and climate scenarios. The long residence time of water within deep aquifers mitigate yearly fluctuations in recharge from precipitation, nonetheless careful management of groundwater resources is important for the continuing groundwater extraction as the Australian continent becomes drier.
Future studies could use these models to inform time-dependent modelling of groundwater flow in response to mountain building and denudation. The long residence time of deep groundwater (400,000 years) is comparable to timescales of tectonic uplift which may, in turn, be influenced by longer-timescale tectonics. Additionally, tectonics controls (i) sediment flux and compaction, and (ii) deep stress, which impacts on the porosity and permeability structure. This would shed light on groundwater dynamics in the wider context of tectonic uplift and erosion along the eastern seaboard of Australia.

Conclusions
Large-scale groundwater modelling is important to understand the long range flow dynamics of continentscale basin systems that underpins many of the more productive shallow systems. Here we developed a general numerical framework for modelling deep groundwater flow that can be applied to basins worldwide. Using a geological model of the SGB Basin, we are able to invert flow rates from borehole temperature data to quantify the groundwater residence times at different depths throughout the region. Groundwater flow in the SGB Basin is partitioned into two systems delineated by the eastern highlands of Australia. The relatively flat terrain on the inland side of the eastern highlands drives fluid through inland aquifers at a slow rate of 0.005 m/day. Integrating these flow rates across much of the inland region yields an average residence time of approximately 400,000 years. In contrast, coastal aquifers experience high flow rates up to 0.3 m/day from the Sydney Basin and an average groundwater residence time of 2000 years. Whilst seepage of seawater into coastal aquifers is significant for some parts of the world, in this context its effect is likely minimal due to the high groundwater flow rates from the eastern highlands through to the continental shelf. Perturbing the model to account for a drop in borehole water levels since 2000, we find flow pathways within the SGB Basin become longer particularly within inland regions. A 17 m drop in water levels from groundwater extraction is accommodated by increased recharge by 20% from deeper aquifers of the Great Artesian Basin into surface alluvial sediments, in line with isotopic evidence of groundwater mixing. The insights provided from this study will help to inform an integrated management Figure 9. Change in water depth measured in groundwater boreholes pre-2000 to 2020 (negative values indicate artesian pressures translated to meters above the ground surface). These maps were generated from our models using Cartopy 0.19 (https:// scito ols. org. uk/ carto py).

Methods
Temperature advection due to groundwater flow is described by the advection-diffusion equation, which in steady-state is, Figure 10. Comparison between regional flow patterns using water table data from pre-2000 to 2010-2020. (A) streamlines illustrate the change in regional flow pathways from pre-2000 (dark grey) to 2010-2020 (dark red), where the thickness of each streamline indicates the relative velocity. This map was generated from our models using Cartopy 0.19 (https:// scito ols. org. uk/ carto py). (B) Decline in water levels measured in groundwater boreholes from pre-2000 to 2020. The "agricultural region" is the western portion of the model that comprises part of the Great Artesian Basin (west of 150 • longitude between 25-32 • S). (C) Flow velocities have increased on average by 10% between 1-2 km depth, which indicates that deeper aquifers of the GAB are recharging alluvial sediments within inland regions to replenish water that has been pumped since 2000. where k T is the thermal conductivity, H is the volumetric source term, c p is the specific heat capacity, q is the flow velocity, and C = ρ w C p w is the product between the density and heat capacity of water. Darcy flux, q , may be calculated from the groundwater flow equation, which in steady state is, where h is the hydraulic head and k h is the hydraulic conductivity. The coupling is implemented by first solving Eq. (5) and substituting the velocity, q , into Eq. (3) to solve for temperature. Since ∇T changes with every solution, Eq. (3) is non-linear and requires several iterations to converge to a final temperature solution. We implement this model in Underworld, a finite element particle-in-cell software framework for geodynamics 18  Boundary conditions. Groundwater recharge and discharge are driven by changes in hydraulic head. The hydraulic head prescribed to the top boundary surface is set to the height of the water Table 45 . In practise, the water table closely resembles a smoother version of the topography without high amplitude hills or mountains (Fig. 1b) and hugs the water level observed in boreholes within shallow (< 200 m deep) unconfined aquifers from the NGIS database. The data were fitted using a weighted spline using the Stripy Python package 46 . The thermal boundary conditions include a constant temperature set to the top boundary ( T 0 = 18 • C), which corresponds to the annual mean surface temperature. The side walls are assigned zero flux and the bottom temperature boundary, T 1 , is an unknown variable that we invert from borehole temperature data within our Bayesian optimisation scheme described below.
Temperature-dependent thermal conductivity. Thermal conductivity is dependent on temperature using an Arrhenius law 47 , where T 0 is the surface temperature (usually close to 298 K), k T 0 is the thermal conductivity at the surface, and a = 0.33 for a dominantly silicate composition or 0.25 for mafic rocks. A list of thermal conductivity values relevant to major lithologies in the Sydney-Gunnedah-Bowen Basin are summarised in Table 1. Depth-dependent hydraulic conductivity. As rocks become more compressed with depth, their permeability decreases and fluid flow is impeded. The reduction in hydraulic conductivity is generally proportional to depth, d 48,49 , where k h 0 is the ground surface hydraulic conductivity that may vary considerably depending on the rock type. For instance, sediments like sandstone may have a hydraulic conductivity that is 3-5 magnitudes higher than volcanic rocks. A list of hydraulic conductivity values relevant to major lithologies in the Sydney-Gunnedah-Bowen Basin are summarised in Table 1. Depth-dependent porosity. The Darcy flux, q , is not equivalent to the fluid velocity, v . Instead, the velocity at which fluid is flowing through the pores is related to the porosity, ϕ, where 0 < ϕ < 1 . The pore space within rocks decrease as they become more compressed with depth. We apply an empirical porosity-depth relationship to our simulations 50 , where ϕ 0 is the porosity at ground surface ( z = 0 ), and m and n are fitting constants which are 0.071 and 5.989, respectively for continental crust.
Formulation of the inverse problem. The inverse solution is given by the a posteriori probability function, P(m|d) , which is proportional to the product of the likelihood function, P(d|m) , and the a priori probability P(m).
The likelihood is the probability of reproducing the data d given a particular model m , and the a priori probability is prior knowledge about the model before assimilating the data. In this case, the model parameters correspond (4) ∇ · q = 0 (5) ∇ · (k h ∇h) = 0 www.nature.com/scientificreports/ to the lower temperature boundary condition T 1 and material properties ( k T , k h , H) assigned to discreet lithologies in the Sydney-Gunnedah-Bowen Basin; the data corresponds to temperature gradients collected from boreholes, T d , and groundwater recharge rates estimated from chloride mass balance methods, q d . Therefore, P(m|d) = P(k T , k h , H, T 1 |�T d , q d ) . The priors for the lithologies are taken from Table 1 and the lower boundary condition is a uniform distribution between 100-1000 • C. The posterior probability can be evaluated through an objective function, S(m) , which compares the misfit between data and prior information, where A is a constant. The maximum a posterori (MAP) estimate is obtained by minimising the ℓ 2 -norm objective function if the priors and data are both uncorrelated, where g is the forward operator, which is the prediction of observations from the model, m . Here, this is the computation of steady-state temperature augmented by the groundwater flow field from which we compare the misfit between temperature profiles from our model and those measured in boreholes.
Optimisation scheme. We use the differential evolution algorithm for finding the global minimum of S(m) 34 . This resides within the family of genetic algorithms that traverse a probability space much more efficiently than traditional Markov-Chain Monte Carlo (MCMC) techniques such as Metropolis Hastings. In differential evolution, multiple starting points ( m 0 ) are initialised from a Latin hypercube and their local probability space is explored using a MCMC sampler. After a specified number of iterations, multiple minima are culled to a subset of promising candidates, which are further refined using gradient-based optimisation. For this last step, convergence to a local minimum is reached when either |m n − m n−1 | < 10 −12 or |S(m n ) − S(m n−1 )| < 10 −12 . The differential evolution algorithm will return the global minimum of S(m) provided there is a sufficient number of seed points and MCMC iterations to sample the posterior.